A novel 6-metabolite signature for prediction of clinical outcomes in type 2 diabetic patients undergoing percutaneous coronary intervention

Background Outcome prediction tools for patients with type 2 diabetes mellitus (T2DM) undergoing percutaneous coronary intervention (PCI) are lacking. Here, we developed a machine learning-based metabolite classifier for predicting 1-year major adverse cardiovascular events (MACEs) after PCI among patients with T2DM. Methods Serum metabolomic profiling was performed in a nested case–control study of 108 matched pairs of patients with T2DM who did or did not experience MACEs at 1 year after PCI; the matched pairs were then assigned 1:1 to the discovery and internal validation sets. External validation was conducted using targeted metabolite analyses in an independent prospective cohort of 301 patients with T2DM receiving PCI. The function of candidate metabolites was explored in high glucose-cultured human aortic smooth muscle cells (HASMCs). Results Overall, serum metabolome profiles differed between diabetic patients with and without 1-year MACEs after PCI. Through VSURF, a machine learning approach for feature selection, we identified the 6 most important metabolic predictors, which mainly targeted nicotinamide adenine dinucleotide (NAD+) metabolism. The 6-metabolite model based on random forest and XGBoost algorithms yielded an area under the curve (AUC) of ≥ 0.90 for predicting MACEs in both the discovery and internal validation sets. External validation of the 6-metabolite classifier also showed good accuracy in predicting MACEs (AUC 0.94, 95% CI 0.91–0.97) and target lesion failure (AUC 0.89, 95% CI 0.83–0.95). In vitro, altering NAD+ biosynthesis significantly affected the bioenergetic profiles, inflammation, and proliferation of HASMCs. Conclusion The 6-metabolite model may help with the noninvasive prediction of 1-year MACEs following PCI among patients with T2DM. Supplementary Information The online version contains supplementary material available at 10.1186/s12933-022-01561-1.


Definition of clinical characteristics
Current smokers were defined as having smoked ≥ 100 cigarettes in their lifetime and now smoking every day or some days. Hypertension was defined as ongoing therapy for hypertension, systolic blood pressure of ≥ 140 mmHg, or diastolic blood pressure of ≥ 90 mmHg. Dyslipidemia was defined as hypercholesterolemia (serum total cholesterol > 5.72 mmol/L), high levels of low-density lipoprotein cholesterol (> 3.1 mmol/L), low levels of high-density lipoprotein cholesterol (< 0.9 mmol/L), and hypertriglyceridemia (serum triglyceride > 1.70 mmol/L). Peripheral vascular disease was defined as disease of arteries other than the coronaries, with exercise-related claudication, revascularization surgery, reduced or absent pulsation, and/or angiographic stenosis of > 50%. Left ventricular end-systolic and end-diastolic volumes were measured using a standard ultrasound machine with a 2.5-MHz probe, and left ventricular ejection fraction was calculated by the biplane Simpson method [2].

Angiographic characteristics
Before primary percutaneous coronary intervention (PCI), all participants underwent invasive coronary angiography using the Judkins percutaneous trans-femoral technique. Digital angiograms were reviewed by two expert observers to document lesion characteristics. From the angiograms, moderate (readily visible but mild degree) or severe (obvious, heavy degree) calcification was recorded [3].
Chronic total occlusion was defined as a luminal occlusion in a native coronary artery with no or minimal contrast penetration through the lesion. Multivessel coronary artery disease (CAD) was defined as coronary artery stenosis of ≥ 50% in at least 2 major epicardial coronary arteries. The anatomical complexity of CAD was assessed using the SYNTAX score [4]. Interobserver variability between 2 observers was 0.84 (0.77-0.91), and intraobserver variability was 0.96 (0.93-0.98).

Study outcome
The primary outcome for all datasets was major adverse cardiovascular events (MACEs), a patient-oriented composite endpoint composed of all-cause death, myocardial infarction (MI), stroke, and repeat revascularization. All-cause death was defined as death from any cause. MI was defined as ischaemic signs or symptoms and new pathological Q-waves in ≥ 2 contiguous ECG leads, and/or an elevation of CK-MB or troponin above the 99th percentile limit of normal and ≥ 20% above the most recent value. Stroke was defined as a focal neurologic deficit of central origin lasting > 72 h, or a focal neurologic deficit of central origin lasting > 24 h with imaging evidence of cerebral infarction or intracerebral hemorrhage. Repeat revascularization was defined as any repeat PCI. All stages of a staged index PCI procedure were considered part of the index revascularization procedure and not a repeat revascularization.
The secondary outcome of the external validation set was target lesion failure, a device-oriented composite endpoint of cardiac death, target vessel-MI, and target lesion revascularization. Cardiac death was defined as any death due to proximate cardiac cause (MI, significant cardiac arrhythmia, refractory congestive heart failure, etc), procedure-related death, and death of unclear cause [5]. Target lesion revascularization was defined as any repeat PCI of the target lesion as a result of restenosis or other complications of the target lesion.

Serum sample preparation
First, 180 µL of serum was mixed with 800 µL of acetonitrile/methanol (1:1, v/v) and 20 µL of internal standard (L-2-chlorophenylalanine, 2 µg/mL). The mixture was then vortexed for 30 s, sonicated for 10 min, and incubated for 1 h at −20 ℃ to precipitate proteins. After centrifugation at 13,000 g for 15 min at 4 ℃, the supernatant (800 µL) was carefully collected, dried under nitrogen, and resuspended in 200 µL of 80% methanol before analysis.
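The volumes in this protocol fix how much serum each microlitre of reconstituted extract represents. The sketch below works through that arithmetic. It assumes that volumes are additive and that the protein pellet removes no liquid volume, neither of which is stated in the protocol; the function name is illustrative.

```python
def serum_fraction_in_extract(serum_ul, solvent_ul, istd_ul,
                              supernatant_ul, resuspend_ul):
    """Return uL of serum represented by each uL of reconstituted extract."""
    # Assumption: volumes are additive (180 + 800 + 20 = 1000 uL).
    total_ul = serum_ul + solvent_ul + istd_ul
    # Serum-equivalent volume carried over in the collected supernatant.
    serum_recovered_ul = serum_ul * supernatant_ul / total_ul
    # After drying, the residue is redissolved in the resuspension volume.
    return serum_recovered_ul / resuspend_ul


untargeted_factor = serum_fraction_in_extract(180, 800, 20, 800, 200)
print(f"{untargeted_factor:.2f} uL serum per uL extract")
```

Under these assumptions, 144 µL of serum equivalent ends up in 200 µL of extract, which corresponds to a factor of 0.72.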

LC-MS analysis
Serum metabolomic profiling was performed on a Vanquish UHPLC system. To acquire MS/MS spectra, the column eluent was further analyzed by information-dependent acquisition on a Q Exactive Orbitrap mass spectrometer.

Serum sample preparation
Briefly, 100 µL of serum was extracted in 400 µL of ice-cold methanol, centrifuged at 13,000 g at 4 ℃ for 10 min, and filtered through 3 kDa membrane cartridges. Sample extracts were then dried under vacuum, reconstituted in 200 µL of 100 mM NH4OAc buffer, and capped before analysis.

LC-MS analysis
The targeted metabolite analysis of the 6 selected metabolites was conducted on a 20AD UPLC system (Shimadzu, Kyoto, Japan) coupled with a QTrap 5500 mass spectrometer (SCIEX, Framingham, USA), as described previously [6]. Samples and standards (20 µL) were injected onto a Phenomenex NH2 column (150 mm × 2 mm × 3 µm) for metabolite separation. A binary solvent gradient consisting of 5 mM NH4OAc adjusted to pH 9.5 with ammonia (mobile phase A) and acetonitrile (mobile phase B) was applied at a flow rate of 0.25 mL/min. The initial solvent composition at injection was 25% A, followed by a 2-min gradient to 45% A and a fast ramp to 80% A (0.1 min), which was maintained for 5.9 min. A was then increased to 95% (2 min), held for 13 min, and reverted to the initial conditions (0.1 min) for equilibration, giving a total run time of 30 min.
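The gradient is easier to check as a timetable. The sketch below converts the step durations given in the text into cumulative time points and interpolates the percentage of mobile phase A at any time. It assumes linear ramps within each step and that the time remaining after the final step, up to the 30-min total, is spent equilibrating at the initial composition. The names are illustrative.

```python
from bisect import bisect_right

INITIAL_A = 25.0        # %A at injection
TOTAL_RUN_MIN = 30.0    # total run time stated in the text
# (step duration in min, %A reached at the end of the step)
STEPS = [(2.0, 45.0), (0.1, 80.0), (5.9, 80.0),
         (2.0, 95.0), (13.0, 95.0), (0.1, 25.0)]


def build_timetable(steps, initial_a, total_run_min):
    """Convert step durations into cumulative (time, %A) points."""
    t = 0.0
    table = [(t, initial_a)]
    for duration, target_a in steps:
        t = round(t + duration, 4)   # rounding avoids float drift in the sums
        table.append((t, target_a))
    if t < total_run_min:            # assumed equilibration hold to end of run
        table.append((total_run_min, table[-1][1]))
    return table


def percent_a(table, t_min):
    """Linearly interpolate %A at time t_min (assumes linear ramps)."""
    if t_min <= table[0][0]:
        return table[0][1]
    times = [t for t, _ in table]
    i = bisect_right(times, t_min) - 1
    if i >= len(table) - 1:
        return table[-1][1]
    (t0, a0), (t1, a1) = table[i], table[i + 1]
    return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


table = build_timetable(STEPS, INITIAL_A, TOTAL_RUN_MIN)
for t, a in table:
    print(f"{t:5.1f} min  A = {a:4.1f}%  B = {100 - a:4.1f}%")
```

The listed steps sum to 23.1 min, so under this reading the final 6.9 min of the 30-min run are equilibration at 25% A.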
The column flow was directed into the MS detector operated in the multiple reaction monitoring mode. The ion source parameters were as follows: sheath gas flow rate, 20 Arb; auxiliary gas flow rate, 10 Arb; capillary temperature, 300 ℃; capillary voltage, 4000 V. Other MS/MS settings, such as declustering potential and collision energy, were optimized for each particular metabolite. The concentrations of detected metabolites were obtained from 7-point calibration curves, which were constructed using the peak area ratios (peak area of the metabolite divided by peak area of the isotope-labeled internal standards) of each calibrator versus its concentration. All calibrators were purchased from Sigma (cat # 72340, cat # M4627, cat # C7344, cat # C0254, cat # P2263). All the involved internal standards, including NAM-D4 (for NAM, 1-MNAM, and ADPR, cat # DLM-6883-PK), L-tryptophan-D5  Table S1.
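The calibration step can be sketched as a straight-line fit of the peak-area ratio against calibrator concentration, followed by inversion for unknown samples. This is a minimal sketch that assumes an unweighted linear fit; the text does not state the regression weighting. The seven calibrator concentrations and ratios are invented for illustration and are not the values in Table S1.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, istd_area, slope, intercept):
    """Back-calculate concentration from the analyte / internal-standard ratio."""
    ratio = analyte_area / istd_area
    return (ratio - intercept) / slope


# Hypothetical 7-point calibration (concentration units are arbitrary here).
concentrations = [0.1, 0.5, 1.0, 5.0, 10.0, 50.0, 100.0]
area_ratios = [0.02 * c + 0.001 for c in concentrations]

slope, intercept = fit_line(concentrations, area_ratios)
unknown = quantify(analyte_area=2010.0, istd_area=10000.0,
                   slope=slope, intercept=intercept)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, unknown={unknown:.2f}")
```

Dividing by the internal-standard peak area before fitting is what lets the isotope-labeled standard correct for variation in extraction and ionization between injections.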

Random forest (RF)
The R package "randomForest" was used for this algorithm. An RF model of the 6 metabolic predictors was trained with 10-fold cross-validation. The 2 key RF parameters, ntree (number of trees to grow in each forest) and mtry (number of predictors sampled for splitting at each node), were set to 500 and the package default, respectively.
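Two details of this setup can be made concrete without the R package itself. For classification, the randomForest default for the number of predictors sampled per split is the floor of the square root of the number of predictors. A 10-fold cross-validation partitions the samples into 10 nearly equal folds. The sketch below illustrates both in plain Python. The sample count of 108 is an assumption (54 matched pairs in the discovery set, inferred from the 1:1 split of 108 pairs), and the unstratified fold assignment is a simplification, not the authors' procedure.

```python
import math
import random


def default_mtry_classification(n_predictors):
    """Default predictors sampled per split for classification forests."""
    return max(1, math.floor(math.sqrt(n_predictors)))


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


mtry = default_mtry_classification(6)   # 6 metabolic predictors
folds = kfold_indices(108, k=10)        # assumed discovery-set size
splits = list(train_test_splits(folds))
print(mtry, [len(f) for f in folds])
```

With 6 predictors the default evaluates to 2, so each split considers 2 randomly chosen metabolites.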

Support Vector Machines (SVM)
The SVM model was fitted with the R package "e1071". The kernel parameter was set as "polynomial", which showed the best predictive performance on the discovery set among all kernel types. We set the "cost" parameter as 0.3, the "gamma" parameter as 1, the "degree" parameter as 4, and the "coef0" parameter as 1.
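In e1071, the polynomial kernel has the form (gamma · u'v + coef0)^degree. The sketch below evaluates that kernel with the parameter values reported above. It is written in plain Python for illustration rather than with the R package, and the example vectors are made up.

```python
def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=4):
    """Polynomial kernel (gamma * <u, v> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


# Two hypothetical scaled 6-metabolite feature vectors.
u = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
v = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
k_uv = polynomial_kernel(u, v)
print(k_uv)
```

Because the degree is 4, the kernel value grows quickly with the inner product, which is one reason a small "cost" value can be useful to limit overfitting.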

Deep neural network (DNN)
We trained a deep, multilayer neural network using the R package "keras". The network had 2 hidden layers, with 32 units in the first layer and 16 units in the second. A rectified linear unit served as the non-linear activation function and a sigmoid function as the classification function. Model optimization was guided by the adaptive moment estimation (Adam) optimizer. The hyperparameters were manually tuned as follows: regularization factor, 0.002; exponential decay rate for the first moment estimate, 0.9; for the second moment estimate, 0.999; batch size, 1024; epochs, 30. A dropout rate of 10% was applied to the layer units. The cross-entropy loss was monitored over the epochs, and the learning rate was reduced when the loss stopped improving.
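The shape of this network can be sketched as a plain-Python forward pass. The sketch assumes 6 input features (one per metabolite) and a single sigmoid output unit, neither of which is stated explicitly. The weights are random placeholders, not trained values. Dropout, regularization, and the optimizer are omitted because they act only during training.

```python
import math
import random

LAYER_SIZES = [6, 32, 16, 1]   # assumed: 6 inputs, 2 hidden layers, 1 output


def init_params(sizes, seed=0):
    """Create placeholder weights and zero biases for each dense layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def forward(params, features):
    """ReLU on hidden layers, sigmoid on the output layer."""
    activations = list(features)
    for layer, (weights, biases) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        if layer == len(params) - 1:
            activations = [1.0 / (1.0 + math.exp(-v)) for v in z]
        else:
            activations = [max(0.0, v) for v in z]
    return activations[0]


def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


params = init_params(LAYER_SIZES)
probability = forward(params, [0.2, -1.0, 0.5, 1.3, -0.4, 0.0])
print(count_params(params), round(probability, 3))
```

Under these assumptions the network has 769 trainable parameters (224 + 528 + 17), which is small relative to typical deep models and consistent with a 6-feature input.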
The processes of sample preparation, LC separation, and MS detection were the same as for the targeted metabolite analysis described above. The isotope-labeled internal standard used for NAD+ detection was 13C-labeled adenosine monophosphate (Cambridge Isotope Laboratories, cat # CNLM-3802-SL-10), as described by Grant et al. [6].

Determination of mitochondrial complex activity
HASMCs were fractionated using a mitochondria isolation kit (Sigma, cat # MITOISO2-1KT). The activity of mitochondrial respiratory chain complexes (I-V) in the mitochondrial fractions was determined by the spectrophotometric methods summarized by Rodenburg RJ [7].

Bioenergetic profiles detected by the Seahorse technology
The Seahorse XFe96 Analyzer (Agilent, Santa Clara, USA) was used to simultaneously detect mitochondrial respiration and glycolysis of HASMCs. For measurement of mitochondrial respiration, the medium was replaced with DMEM containing 1 mM pyruvate, 2 mM glutamine, and 25 mM glucose prior to assays. Then, cells were sequentially exposed to 1 µM oligomycin (Oligo, Abcam Biochemicals, cat # 1404-19-9), 0.75 µM FCCP (Sigma, cat # C2920), and a mix of 1 µM each of rotenone (Sigma, cat # R8875) and antimycin (Sigma, cat # A8674) (R + A) at the indicated times, and the oxygen consumption rates (OCRs) of HASMCs were measured over time (as presented in Fig. 5C). According to the manufacturer's protocol, non-mitochondrial OCR was calculated as the OCR after R + A injection; basal respiration was calculated as the difference between the OCR before Oligo injection and non-mitochondrial OCR; ATP-linked respiration was calculated as the difference between basal and Oligo-inhibited OCR; maximal respiration was calculated as the difference between FCCP-induced OCR and non-mitochondrial respiration; and reserve respiratory capacity was calculated as the difference between maximal and basal respiration.
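The respiration parameters defined above reduce to a few subtractions. The sketch below computes them from four summary OCR readings. It assumes ATP-linked respiration is taken as the drop in OCR from before to after Oligo injection, which is one reading of the definition above. The input values are invented for illustration.

```python
def respiration_parameters(ocr_before_oligo, ocr_after_oligo,
                           ocr_after_fccp, ocr_after_rot_aa):
    """Derive mitochondrial respiration parameters from summary OCR readings."""
    non_mito = ocr_after_rot_aa
    basal = ocr_before_oligo - non_mito
    # Assumption: ATP-linked respiration is the drop in OCR caused by Oligo.
    atp_linked = ocr_before_oligo - ocr_after_oligo
    maximal = ocr_after_fccp - non_mito
    reserve = maximal - basal
    return {
        "non_mitochondrial": non_mito,
        "basal": basal,
        "atp_linked": atp_linked,
        "maximal": maximal,
        "reserve_capacity": reserve,
    }


# Hypothetical protein-normalized OCR readings for one well.
ocr = respiration_parameters(ocr_before_oligo=100.0, ocr_after_oligo=40.0,
                             ocr_after_fccp=180.0, ocr_after_rot_aa=20.0)
print(ocr)
```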
For measurement of glycolytic flux, the medium was replaced with glucose-free DMEM prior to assays. Then, cells were sequentially exposed to 19 mM glucose, 1 µM Oligo, and 50 mM 2-deoxy-glucose (Sigma, cat # D6134) at the indicated times (as presented in Fig. 5E), and the ECARs of cells were measured over time.
According to the manufacturer's protocol, non-glycolytic acidification was calculated as ECAR before glucose injection; glycolysis was calculated as the difference between glucose-induced ECAR and non-glycolytic acidification; glycolytic capacity was calculated as the difference between Oligo-induced ECAR and non-glycolytic acidification; glycolytic reserve was calculated as the difference between glycolytic capacity and glycolysis. All the readings of OCRs and ECARs were normalized to the total protein content of each well (determined by the Bradford method [8]).
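The glycolytic parameters and the protein normalization can be sketched in the same way. The function and argument names are illustrative, and the readings and protein amount are invented values.

```python
def normalize_to_protein(rate, protein_ug):
    """Express a raw OCR or ECAR reading per microgram of total protein."""
    return rate / protein_ug


def glycolysis_parameters(ecar_before_glucose, ecar_after_glucose,
                          ecar_after_oligo):
    """Derive glycolytic parameters from summary ECAR readings."""
    non_glycolytic = ecar_before_glucose
    glycolysis = ecar_after_glucose - non_glycolytic
    capacity = ecar_after_oligo - non_glycolytic
    reserve = capacity - glycolysis
    return {
        "non_glycolytic_acidification": non_glycolytic,
        "glycolysis": glycolysis,
        "glycolytic_capacity": capacity,
        "glycolytic_reserve": reserve,
    }


# Hypothetical raw ECAR readings and Bradford protein content for one well.
protein_ug = 20.0
raw = {"before_glucose": 200.0, "after_glucose": 800.0, "after_oligo": 1200.0}
norm = {k: normalize_to_protein(v, protein_ug) for k, v in raw.items()}
ecar = glycolysis_parameters(norm["before_glucose"], norm["after_glucose"],
                             norm["after_oligo"])
print(ecar)
```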

Reverse-transcription quantitative PCR (RT-qPCR)
Total RNA was extracted from HASMCs using the Trizol reagent (Sigma, cat # 93289). cDNA was synthesized with RNA to cDNA EcoDry Premix (Clontech, cat # 639547). The mRNA expression of proinflammatory genes was determined by RT-qPCR on a CFX96 Touch system (Bio-Rad, Hercules, USA). RT-qPCR was performed in triplicate based on the MIQE guidelines [9]. The relative expression of a target gene was normalized to the expression of the reference gene (GAPDH) using the 2^(−ΔΔCq) method [10]. The calibrator was a cDNA sample from the control group. The primers used for RT-qPCR analysis are listed in Table S2.
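The relative-expression calculation can be sketched as follows. It assumes roughly 100% amplification efficiency for both target and reference genes, as the method requires, and averages technical triplicates before taking differences. The Cq values are invented for illustration.

```python
from statistics import mean


def relative_expression(cq_target_sample, cq_ref_sample,
                        cq_target_calibrator, cq_ref_calibrator):
    """Fold change by the 2^(-ddCq) method from lists of replicate Cq values."""
    d_cq_sample = mean(cq_target_sample) - mean(cq_ref_sample)
    d_cq_calibrator = mean(cq_target_calibrator) - mean(cq_ref_calibrator)
    dd_cq = d_cq_sample - d_cq_calibrator
    return 2.0 ** (-dd_cq)


# Hypothetical triplicate Cq values: target gene vs GAPDH.
fold_change = relative_expression(
    cq_target_sample=[24.0, 24.5, 23.5],
    cq_ref_sample=[18.0, 18.0, 18.0],
    cq_target_calibrator=[26.0, 26.0, 26.0],
    cq_ref_calibrator=[18.0, 18.0, 18.0],
)
print(fold_change)
```

Here the treated sample reaches threshold 2 cycles earlier than the calibrator after GAPDH normalization, which corresponds to a 4-fold higher expression.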

Transwell migration assay
HASMCs were seeded into 24-well plates, with administration of FK866, 1MT, or NMN as mentioned above. Transwell inserts with 8 μm pores (Sigma, cat # CLS9668) were pre-coated with 10 μg/mL fibronectin for 24 h, and applied in suspension. THP1 monocytes were labeled with Qdot (Invitrogen, cat # Q10141MP) following the manufacturer's instructions, and then added to the top of each transwell.
After 4 h incubation, transwell inserts were taken, and un-migrated monocytes on the upper membrane were removed by gentle rubbing with a cotton bud. Migrated Qdot-labeled THP1 cells were enumerated by fluorescence microscopy.

Proliferation assay
HASMC proliferation was assessed by a methylene blue assay [11].

Supplementary Figures

Figure S1
The Z distribution of methionine abundance in both discovery and internal validation sets. The Z scores of methionine across all samples were greater than -3 (-3SD), indicating that the serum samples were properly stored for metabolite detection.
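A storage check of this kind can be sketched as a Z-score screen. The sketch assumes Z scores are computed against the mean and sample standard deviation of all samples. Any transformation applied to the abundances beforehand is not stated, so none is applied here. The abundance values are invented.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values against their own mean and sample SD."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


def flag_low_samples(values, threshold=-3.0):
    """Return indices of samples whose Z score falls below the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if z < threshold]


# Hypothetical methionine abundances: 20 typical samples plus one degraded one.
abundances = [10.0] * 20 + [1.0]
flagged = flag_low_samples(abundances)
print(flagged)
```

An empty result corresponds to the situation reported for Figure S1, where no sample fell below -3 SD.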

Figure S3
The OPLS-DA analysis assessing the performance of 35 differential metabolites for discrimination of MACEs from matched controls in the discovery (A) and internal validation sets (B).

Figure S4
The differences in cumulative rates of MACEs between participants with high-risk and low-risk scores of the 6-metabolite signature. P values were derived from Cox regression with adjustment for age, sex, smoking status, obesity (BMI > 25 kg/m²), hypertension, HbA1c, LVEF < 50%, clinical presentations, multivessel CAD, SYNTAX score, and stent types.

Supplementary Tables
Table S1 Calibration data for 6 metabolites detected by targeted metabolite analyses.